GRAPE for fast and scalable graph processing and random-walk-based embedding

Author: Cano Gutiérrez Carlos
Cappelletti Luca
Publication venue: Springer Nature
Publication date: 26/06/2023

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.

Funding: National Center for Gene Therapy and Drugs based on RNA Technology, PNRR-NextGenerationEU program G43C22001320007; United States Department of Health & Human Services National Institutes of Health (NIH) - USA NIH National Cancer Institute (NCI) U01-CA239108-02; Transition Grant Line 1A Project NIMI PARTENARIATI H2020 1R24OD011883-01; United States Department of Health & Human Services National Institutes of Health (NIH) - USA U01-CA239108-02 DE-AC02-05CH11231; United States Department of Energy (DOE); European Union (EU) Marie Curie Actions PSR2015-1720GVALE_01 PID2021-128970OA-I0

Repositorio Institucional Universidad de Granada

Concurrent role of metal (Sn, Zn) and N species in enhancing the photocatalytic activity of TiO2 under solar light

Author: Cappelletti Giuseppe
Cerrato Giuseppina
Falletta Ermelinda
Meroni Daniela
Pargoletti Eleonora
Rimoldi Luca
Turco Francesca
Publication venue: Elsevier BV
Publication date: 01/01/2018

Institutional Research Information System University of Turin

Het-node2vec: second order random walk sampling for heterogeneous multigraphs embedding

Author: Cappelletti Luca
Casiraghi Elena
Fontana Tommaso
Ravanmehr Vida
Reese Justin
Robinson Peter
Valentini Giorgio
Publication date: 05/01/2021

We introduce a set of algorithms (Het-node2vec) that extend the original node2vec node-neighborhood sampling method to heterogeneous multigraphs, i.e., networks characterized by multiple types of nodes and edges. The resulting random walk samples capture both the structural characteristics of the graph and the semantics of the different types of nodes and edges. The proposed algorithms can focus their attention on specific node or edge types, yielding accurate representations even for underrepresented types of nodes or edges that are of interest for the prediction problem under investigation. These rich and well-focused representations can boost unsupervised and supervised learning on heterogeneous graphs.

Comment: 20 pages, 5 figures
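As a rough illustration of the idea in the Het-node2vec abstract above, the sketch below implements the standard node2vec second-order bias (return parameter p, in-out parameter q) and adds a multiplicative weight for neighbors whose type is in a set of focus types. The function name `biased_walk`, the `focus_types` and `boost` parameters, and the toy gene/disease graph are assumptions made for this example; the abstract does not specify the actual sampling scheme, so this should not be read as the authors' algorithm.

```python
import random

def biased_walk(adj, node_type, start, length, p=1.0, q=1.0,
                focus_types=frozenset(), boost=1.0, rng=None):
    """Sample one second-order random walk of `length` nodes.

    adj: dict mapping each node to the set of its neighbors (undirected).
    node_type: dict mapping each node to a type label.
    p, q: node2vec return and in-out parameters.
    focus_types, boost: hypothetical type bias, not taken from the paper.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducibility
        if not neighbors:
            break
        prev = walk[-2] if len(walk) > 1 else None
        weights = []
        for nxt in neighbors:
            if prev is None:
                w = 1.0          # first step: uniform
            elif nxt == prev:
                w = 1.0 / p      # return to the previous node
            elif nxt in adj[prev]:
                w = 1.0          # stay close to the previous node
            else:
                w = 1.0 / q      # move away from the previous node
            if node_type[nxt] in focus_types:
                w *= boost       # assumed bias toward focus types
            weights.append(w)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Toy heterogeneous graph with two node types (illustrative only).
adj = {
    "g1": {"d1", "g2"},
    "g2": {"g1", "d1"},
    "d1": {"g1", "g2", "d2"},
    "d2": {"d1"},
}
node_type = {"g1": "gene", "g2": "gene", "d1": "disease", "d2": "disease"}

print(biased_walk(adj, node_type, "g1", 8, p=0.5, q=2.0,
                  focus_types={"disease"}, boost=5.0,
                  rng=random.Random(0)))
```

With `boost` greater than 1, walks visit the focus type more often, which is one simple way to obtain more training context for an underrepresented type.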

arXiv.org e-Print Archive

parSMURF, a high-performance computing tool for the genome-wide detection of pathogenic variants.

Author: Cappelletti Luca
Castrignanò Tiziana
Danis Daniel
Frasca Marco
Grossi Giuliano
Mesiti Marco
Petrini Alessandro
Re Matteo
Robinson Peter N
Schubach Max
Valentini Giorgio
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2020

BACKGROUND: Several prediction problems in computational biology and genomic medicine are characterized by both big data and a high imbalance between examples to be learned, whereby positive examples can represent a tiny minority with respect to negative examples. For instance, deleterious or pathogenic variants are overwhelmed by the sea of neutral variants in the non-coding regions of the genome: thus, the prediction of deleterious variants is a challenging, highly imbalanced classification problem, and classical prediction tools fail to detect the rare pathogenic examples among the huge amount of neutral variants or undergo severe restrictions in managing big genomic data. RESULTS: To overcome these limitations we propose parSMURF, a method that adopts a hyper-ensemble approach and oversampling and undersampling techniques to deal with imbalanced data, and parallel computational techniques to both manage big genomic data and substantially speed up the computation. The synergy between Bayesian optimization techniques and the parallel nature of parSMURF enables efficient and user-friendly automatic tuning of the hyper-parameters of the algorithm, and allows specific learning problems in genomic medicine to be easily fit. Moreover, by using MPI parallel and machine learning ensemble techniques, parSMURF can manage big data by partitioning them across the nodes of a high-performance computing cluster. Results with synthetic data and with single-nucleotide variants associated with Mendelian diseases and with genome-wide association study hits in the non-coding regions of the human genome, involving millions of examples, show that parSMURF achieves state-of-the-art results and an 80-fold speed-up with respect to the sequential version.
CONCLUSIONS: parSMURF is a parallel machine learning tool that can be trained to learn different genomic problems, and its multiple levels of parallelization and high scalability allow us to efficiently fit problems characterized by big and imbalanced genomic data. The C++ OpenMP multi-core version tailored to a single workstation and the C++ MPI/OpenMP hybrid multi-core and multi-node parSMURF version tailored to a High Performance Computing cluster are both available at https://github.com/AnacletoLAB/parSMURF
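The resampling idea mentioned in the parSMURF abstract above (oversampling the rare positives, undersampling the abundant negatives, and building an ensemble over partitions) can be sketched as follows. This is a minimal illustration under stated assumptions: the function `balanced_partitions`, its parameters, and the use of plain duplication for oversampling are all hypothetical choices for this example and are not taken from the parSMURF code base.

```python
import random

def balanced_partitions(positives, negatives, n_parts,
                        over_factor=2, neg_per_pos=1, rng=None):
    """Build one rebalanced training set per partition of the negatives.

    positives, negatives: lists of examples (any objects).
    n_parts: number of disjoint partitions of the negative class.
    over_factor: assumed oversampling factor (duplication with replacement).
    neg_per_pos: assumed number of negatives kept per (oversampled) positive.
    """
    rng = rng or random.Random()
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    # Disjoint partitions of the majority class.
    parts = [shuffled[i::n_parts] for i in range(n_parts)]
    training_sets = []
    for part in parts:
        # Oversample: keep every positive and add random duplicates.
        extra = len(positives) * (over_factor - 1)
        pos = list(positives) + [rng.choice(positives) for _ in range(extra)]
        # Undersample: keep only a few negatives from this partition.
        k = min(len(part), neg_per_pos * len(pos))
        neg = rng.sample(part, k)
        training_sets.append((pos, neg))
    return training_sets

# Toy data: 5 positives versus 1000 negatives (illustrative only).
positives = list(range(5))
negatives = list(range(100, 1100))
sets = balanced_partitions(positives, negatives, n_parts=10,
                           over_factor=3, neg_per_pos=2,
                           rng=random.Random(0))
for pos, neg in sets[:2]:
    print(len(pos), len(neg))
```

In an ensemble of this kind, each rebalanced set would typically train its own base learner and the learners' scores would be averaged. Because the partitions are independent, they can be processed in parallel, which is the property a multi-core or multi-node implementation exploits.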

The Jackson Laboratory: The Mouseion at the JAXlibrary

Unitus DSpace

GraPE: fast and scalable Graph Processing and Embedding

Author: Callahan Tiffany J.
Cappelletti Luca
Casiraghi Elena
Fontana Tommaso
Joachimiak Marcin P.
Mungall Christopher J.
Ravanmehr Vida
Reese Justin
Robinson Peter N.
Valentini Giorgio
Publication date: 12/10/2021

Graph Representation Learning methods have enabled a wide range of learning problems to be addressed for data that can be represented in graph form. Nevertheless, several real-world problems in economy, biology, medicine and other fields raised relevant scaling problems with existing methods and their software implementation, due to the size of real-world graphs characterized by millions of nodes and billions of edges. We present GraPE, a software resource for graph processing and random-walk-based embedding that can scale with large and high-degree graphs and significantly speed up computation. GraPE comprises specialized data structures, algorithms, and a fast parallel implementation that displays several orders of magnitude improvement in empirical space and time complexity compared to state-of-the-art software resources, with a corresponding boost in the performance of machine learning methods for edge and node label prediction and for the unsupervised analysis of graphs. GraPE is designed to run on laptop and desktop computers, as well as on high-performance computing clusters.

arXiv.org e-Print Archive

GRAPE for fast and scalable graph processing and random-walk-based embedding

Author: Callahan Tiffany J
Cano Carlos
Cappelletti Luca
Casiraghi Elena
Fontana Tommaso
Joachimiak Marcin P
Mungall Christopher J
Ravanmehr Vida
Reese Justin
Robinson Peter N
Valentini Giorgio
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/07/2023

Graph representation learning methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE (Graph Representation Learning, Prediction and Evaluation), a software resource for graph processing and embedding that is able to scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random-walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as competitive edge- and node-label prediction performance. GRAPE comprises approximately 1.7 million well-documented lines of Python and Rust code and provides 69 node-embedding methods, 25 inference models, a collection of efficient graph-processing utilities, and over 80,000 graphs from the literature and other sources. Standardized interfaces allow a seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of graph-representation-learning methods, therefore also positioning GRAPE as a software resource that performs a fair comparison between methods and libraries for graph processing and embedding.

The Jackson Laboratory: The Mouseion at the JAXlibrary

Metronomic Oral Vinorelbine: An Alternative Schedule in Elderly and Patients PS2 With Local/Advanced and Metastatic NSCLC Not Oncogene-addicted

Author: Alessandroni Paolo
Baldelli Annamaria
Bracci Raffaella
Cappelletti Claudia
Catalano Vincenzo
Fedeli Stefano Luzi
Giordani Paolo
Graziano Francesco
Imperatori Luca
Laici Gianluca
Lippe Paolo
Rocchi Marco Bruno Luigi
Rossi David
Sarti Donatella
Tamburrano Tiziana
Publication venue: Anticancer Research USA Inc.
Publication date: 01/01/2020

The MILES and ELVIS studies showed that vinorelbine is one of the best options for elderly patients with advanced non-small-cell lung cancer (NSCLC). Oral vinorelbine at standard schedule (60-80 mg/m2/weekly) has good activity in terms of response rates and progression-free survival. In recent years, a metronomic schedule of oral vinorelbine (40-50 mg/m2 three times a week, continuously) has been studied in phase II trials, especially in unfit and elderly patients. In the MOVE trial, metronomic oral vinorelbine had a clinical benefit [partial response (PR)+stable disease (SD) >12 weeks] in 58.1% of patients with mild toxicity. On this basis, in 2017 we started a phase II study with metronomic oral vinorelbine in elderly (over 70 years) or unfit [Eastern Cooperative Oncology Group performance score (ECOG-PS) of 2] patients with locally/advanced and metastatic NSCLC. Primary aims were clinical benefit (PR+SD ≥6 months) and toxicity; secondary aims were progression-free survival and overall survival.

Archivio istituzionale della ricerca - Università di Urbino

Volatile lipophilic substances management in case of fatal sniffing.

Author: Augsburger M.
Bottoni E.
Cappelletti S.
Ciallella C.
di Luca N.M.
Fiore P.A.
Giuliani N.
Romolo F.S.
Varlet V.
Publication venue: Elsevier BV
Publication date: 01/01/2017

Death due to inhalation of aliphatic hydrocarbons such as butane and propane is a particularly serious problem worldwide, resulting in several fatal cases of sniffing these volatile substances in order to "get high". Despite the number of cases published, there is no single approach to case management of fatal sniffing. In this paper we illustrate the management of volatile lipophilic substances in the case of a prisoner who died after sniffing a butane-propane gas mixture from prefilled camping stove gas canisters, discussing the comprehensive approach to the crime scene, the autopsy, histology and toxicology. A large set of accurate values of both butane and propane was obtained by gas chromatography-mass spectrometry analysis of the following post-mortem biological samples: peripheral blood, heart blood, vitreous humor, liver, lung, heart, brain/cerebral cortex, fat tissue, and kidney. These values allowed an in-depth discussion of the cause of death. Following the proper sampling approach during autopsy plays a key role.

Serveur académique lausannois

Archivio della ricerca- Università di Roma La Sapienza

Sulfate source apportionment in the Ny-Ålesund (Svalbard Islands) Arctic aerosol

Author: Bazzano Andrea
Becagli Silvia
Bolzacchini Ezio
Caiazzo Laura
Cappelletti David
Ferrero Luca
Frosini Daniele
Giardi Fabio
Grotti Marco
Lupi Angelo
Malandrino Mery
Mazzola Mauro
Moroni Beatrice
Severi Mirko
Traversi Rita
Udisti Roberto
Viola Angelo
Vitale Vito
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Florence Research

Archivio istituzionale della ricerca - Università di Genova

Supervised learning with word embeddings derived from PubMed captures latent knowledge about protein kinases and cancer.

Author: Blau Hannah
Bocci Giovanni
Bult Carol J
Cappelletti Luca
Carmody Leigh
Casiraghi Elena
Coleman Ben D
Fontana Tommaso
George Joshy
Hansen Peter
Joachimiak Marcin
Mungall Christopher
Oprea Tudor I
Ravanmehr Vida
Reese Justin
Robinson Peter N
Rueter Jens
Valentini Giorgio
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/12/2021

Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach based on natural language processing and machine learning to investigate the relations between PKs and cancers, predicting PKs whose inhibition would be efficacious to treat a certain cancer. Our approach represents PKs and cancers as semantically meaningful 100-dimensional vectors based on word and concept neighborhoods in PubMed abstracts. We use information about phase I-IV trials in ClinicalTrials.gov to construct a training set for random forest classification. Our results with historical data show that associations between PKs and specific cancers can be predicted years in advance with good accuracy. Our tool can be used to predict the relevance of inhibiting PKs for specific cancers and to support the design of well-focused clinical trials to discover novel PKIs for cancer therapy.

The Jackson Laboratory: The Mouseion at the JAXlibrary

eScholarship - University of California